• Sharp analysis of power iteration for tensor PCA

    Updated: 2024-08-31 23:46:18
    Yuchen Wu, Kangjie Zhou. 25(195):1-42, 2024.
    Abstract: We investigate the power iteration algorithm for the tensor PCA model introduced in Richard and Montanari (2014). Previous work studying the properties of tensor power iteration is either limited to a constant number of iterations, or requires a non-trivial data-independent initialization. In this paper, we move beyond these limitations and analyze the dynamics of randomly initialized tensor power iteration up to polynomially many steps. Our contributions are threefold: First, we establish sharp …
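The algorithm the abstract analyzes repeatedly contracts the tensor against the current iterate and renormalizes. A minimal sketch for a symmetric order-3 tensor with a planted rank-one spike (the construction and all names here are illustrative, not taken from the paper):

```python
import random


def contract(T, x):
    """y_i = sum_{j,k} T[i][j][k] * x[j] * x[k]."""
    n = len(x)
    return [sum(T[i][j][k] * x[j] * x[k] for j in range(n) for k in range(n))
            for i in range(n)]


def normalize(v):
    norm = sum(vi * vi for vi in v) ** 0.5
    return [vi / norm for vi in v]


def tensor_power_iteration(T, x0, steps):
    """Randomly initialized power iteration: x <- contract(T, x), renormalized."""
    x = normalize(x0)
    for _ in range(steps):
        x = normalize(contract(T, x))
    return x


# Noiseless rank-one tensor T = v (x) v (x) v with planted spike v.
n = 5
v = normalize([1.0] * n)
T = [[[v[i] * v[j] * v[k] for k in range(n)] for j in range(n)] for i in range(n)]

random.seed(0)
x_hat = tensor_power_iteration(T, [random.gauss(0, 1) for _ in range(n)], steps=20)
overlap = abs(sum(a * b for a, b in zip(x_hat, v)))
```

In this noiseless toy case a single step already recovers the spike; the paper's question is how many steps the noisy spiked model needs from a random start.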

  • BenchMARL: Benchmarking Multi-Agent Reinforcement Learning

    Updated: 2024-08-31 23:46:18
    Matteo Bettini, Amanda Prorok, Vincent Moens. 25(217):1-10, 2024.
    Abstract: The field of Multi-Agent Reinforcement Learning (MARL) is currently facing a reproducibility crisis. While solutions for standardized reporting have been proposed to address the issue, we still lack a benchmarking tool that enables standardization and reproducibility, while leveraging cutting-edge Reinforcement Learning (RL) implementations. In this paper, we introduce BenchMARL, the first MARL training library created to enable standardized benchmarking across …

  • Optimal Locally Private Nonparametric Classification with Public Data

    Updated: 2024-08-31 23:46:18
    Yuheng Ma, Hanfang Yang. 25(167):1-62, 2024.
    Abstract: In this work, we investigate the problem of public data assisted non-interactive locally differentially private (LDP) learning with a focus on non-parametric classification. Under the posterior drift assumption, we for the first time derive the minimax optimal convergence rate under the LDP constraint. Then, we present a novel approach, the locally differentially private classification tree, which attains the minimax optimal convergence rate. Furthermore, we design a …

  • Split Conformal Prediction and Non-Exchangeable Data

    Updated: 2024-08-31 23:46:18
    Roberto I. Oliveira, Paulo Orenstein, Thiago Ramos, João Vitor Romano. 25(225):1-38, 2024.
    Abstract: Split conformal prediction (CP) is arguably the most popular CP method for uncertainty quantification, enjoying both academic interest and widespread deployment. However, the original theoretical analysis of split CP makes the crucial assumption of data exchangeability, which hinders many real-world applications. In this paper, we present a novel theoretical framework based on concentration inequalities and decoupling properties of the data, …
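For context, the exchangeability-based split CP recipe the paper revisits is short: compute residuals on a held-out calibration set, take a finite-sample-corrected quantile, and widen predictions by that radius. A sketch under standard assumptions (the toy model and data here are made up for illustration):

```python
import math
import random


def split_conformal_radius(residuals, alpha):
    """The ceil((n+1)(1-alpha))-th smallest calibration residual.

    Under exchangeability this radius gives marginal coverage >= 1 - alpha.
    """
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    return sorted(residuals)[k - 1]


# Toy setup: y = 2x + noise, and a pre-trained "model" f(x) = 2x.
random.seed(1)
f = lambda x: 2 * x
calibration = [(x, 2 * x + random.gauss(0, 0.1)) for x in range(100)]
residuals = [abs(y - f(x)) for x, y in calibration]
q = split_conformal_radius(residuals, alpha=0.1)

# 90% prediction interval for a new point x_new = 3.
x_new = 3
interval = (f(x_new) - q, f(x_new) + q)
```

The paper's contribution is precisely what happens to this guarantee when the calibration and test data are not exchangeable.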

  • On the Intrinsic Structures of Spiking Neural Networks

    Updated: 2024-08-31 23:46:18
    Shao-Qun Zhang, Jia-Yi Chen, Jin-Hui Wu, Gao Zhang, Huan Xiong, Bin Gu, Zhi-Hua Zhou. 25(194):1-74, 2024.
    Abstract: Recent years have seen a surge of interest in spiking neural networks (SNNs). The performance of SNNs hinges not only on searching for apposite architectures and connection weights, as in conventional artificial neural networks, but also on the meticulous configuration of their intrinsic structures. However, there has been a dearth of comprehensive studies examining the impact of intrinsic structures; thus developers often …

  • Invariant Physics-Informed Neural Networks for Ordinary Differential Equations

    Updated: 2024-08-31 23:46:18
    Shivam Arora, Alex Bihlo, Francis Valiquette. 25(233):1-24, 2024.
    Abstract: Physics-informed neural networks have emerged as a prominent new method for solving differential equations. While conceptually straightforward, they often suffer training difficulties that lead to relatively large discretization errors or the failure to obtain correct solutions. In this paper we introduce invariant physics-informed neural networks for ordinary differential equations that admit a finite-dimensional group of Lie point symmetries. …

  • Bayesian Regression Markets

    Updated: 2024-08-31 23:46:18
    Although machine learning tasks are highly sensitive to the quality of input data, relevant datasets can often be challenging for firms to acquire, especially when held privately by a variety of owners. For instance, if these owners are competitors in a downstream market, they may be reluctant to share information. Focusing on supervised learning for regression tasks, we develop a regression market to provide a monetary incentive for data sharing. Our mechanism adopts a Bayesian framework, allowing us to consider a more general class of regression tasks. We present a thorough exploration of the market properties, and show that similar proposals in literature expose the market agents to sizeable financial risks, which can be mitigated in our setup.

  • Three-Way Trade-Off in Multi-Objective Learning: Optimization, Generalization and Conflict-Avoidance

    Updated: 2024-08-31 23:46:18
    Lisha Chen, Heshan Fernando, Yiming Ying, Tianyi Chen. 25(193):1-53, 2024.
    Abstract: Multi-objective learning (MOL) often arises in machine learning problems when there are multiple data modalities or tasks. One critical challenge in MOL is the potential conflict among different objectives during the optimization process. Recent works have developed various dynamic weighting algorithms for MOL, where the central idea is to find an update direction that avoids conflicts among objectives. Albeit …

  • Distribution Learning via Neural Differential Equations: A Nonparametric Statistical Perspective

    Updated: 2024-08-31 23:46:18
    Youssef Marzouk, Zhi Robert Ren, Sven Wang, Jakob Zech. 25(232):1-61, 2024.
    Abstract: Ordinary differential equations (ODEs), via their induced flow maps, provide a powerful framework to parameterize invertible transformations for representing complex probability distributions. While such models have achieved enormous success in machine learning, little is known about their statistical properties. This work establishes the first general nonparametric statistical convergence analysis for distribution …

  • Parallel-in-Time Probabilistic Numerical ODE Solvers

    Updated: 2024-08-31 23:46:18
    Nathanael Bosch, Adrien Corenflos, Fatemeh Yaghoobi, Filip Tronarp, Philipp Hennig, Simo Särkkä. 25(206):1-27, 2024.
    Abstract: Probabilistic numerical solvers for ordinary differential equations (ODEs) treat the numerical simulation of dynamical systems as problems of Bayesian state estimation. Aside from producing posterior distributions over ODE solutions and thereby quantifying the numerical approximation error of the method itself, one less-often noted advantage of this formalism is the algorithmic flexibility gained by formulating numerical …

  • Neural Feature Learning in Function Space

    Updated: 2024-08-31 23:46:18
    Xiangxiang Xu, Lizhong Zheng. 25(142):1-76, 2024.
    Abstract: We present a novel framework for learning system design with neural feature extractors. First, we introduce the feature geometry, which unifies statistical dependence and feature representations in a function space equipped with inner products. This connection defines function-space concepts on statistical dependence, such as norms, orthogonal projection, and spectral decomposition, exhibiting clear operational meanings. In particular, we associate each learning setting with a dependence …

  • Learning to Warm-Start Fixed-Point Optimization Algorithms

    Updated: 2024-08-31 23:46:18
    Rajiv Sambharya, Georgina Hall, Brandon Amos, Bartolomeo Stellato. 25(166):1-46, 2024.
    Abstract: We introduce a machine-learning framework to warm-start fixed-point optimization algorithms. Our architecture consists of a neural network mapping problem parameters to warm starts, followed by a predefined number of fixed-point iterations. We propose two loss functions designed to either minimize the fixed-point residual or the distance to a ground truth solution. In this way, the neural network predicts warm starts with the end-to-end goal …
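The idea of judging a warm start by the fixed-point residual after a fixed budget of iterations can be sketched in a few lines. Here the "learned" warm-start map is hard-coded to the ideal predictor a trained network would approximate; the parametric family and all names are hypothetical:

```python
def fixed_point_iterate(f, x0, k):
    """Run a predefined number k of iterations x <- f(x)."""
    x = x0
    for _ in range(k):
        x = f(x)
    return x


def residual(f, x):
    """Fixed-point residual |f(x) - x|, one of the two losses in the abstract."""
    return abs(f(x) - x)


# Parametric family of contractions f_theta(x) = 0.5 * x + theta,
# whose fixed point is x* = 2 * theta.
make_f = lambda theta: (lambda x: 0.5 * x + theta)

# Stand-in for the learned warm-start map theta -> x0 (here: the exact solution).
warm_start = lambda theta: 2.0 * theta

theta = 3.0
f = make_f(theta)
cold = fixed_point_iterate(f, 0.0, k=5)              # cold start from zero
warm = fixed_point_iterate(f, warm_start(theta), k=5)  # learned warm start
```

After the same five iterations, the warm-started run has zero residual while the cold start is still contracting toward the fixed point.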

  • Multi-Objective Neural Architecture Search by Learning Search Space Partitions

    Updated: 2024-08-31 23:46:18
    Yiyang Zhao, Linnan Wang, Tian Guo. 25(177):1-41, 2024.
    Abstract: Deploying deep learning models requires taking into consideration neural network metrics such as model size, inference latency, and FLOPs, aside from inference accuracy. This leads deep learning model designers to leverage multi-objective optimization to design deep neural networks that are effective across multiple criteria. However, applying multi-objective optimization to neural architecture search (NAS) is nontrivial because NAS tasks usually have a huge …

  • Variational Estimators of the Degree-corrected Latent Block Model for Bipartite Networks

    Updated: 2024-08-31 23:46:18
    Yunpeng Zhao, Ning Hao, Ji Zhu. 25(150):1-42, 2024.
    Abstract: Bipartite graphs are ubiquitous across various scientific and engineering fields. Simultaneously grouping the two types of nodes in a bipartite graph via biclustering represents a fundamental challenge in network analysis for such graphs. The latent block model (LBM) is a commonly used model-based tool for biclustering. However, the effectiveness of the LBM is often limited by the influence of row and column sums in the data matrix. To address this …

  • PyGOD: A Python Library for Graph Outlier Detection

    Updated: 2024-08-31 23:46:18
    Kay Liu, Yingtong Dou, Xueying Ding, Xiyang Hu, Ruitong Zhang, Hao Peng, Lichao Sun, Philip S. Yu. 25(141):1-9, 2024.
    Abstract: PyGOD is an open-source Python library for detecting outliers in graph data. As the first comprehensive library of its kind, PyGOD supports a wide array of leading graph-based methods for outlier detection under an easy-to-use, well-documented API designed for use by both researchers and practitioners. PyGOD provides modularized components of the different detectors implemented so that users can easily customize …

  • Classification of Data Generated by Gaussian Mixture Models Using Deep ReLU Networks

    Updated: 2024-08-31 23:46:18
    Tian-Yi Zhou, Xiaoming Huo. 25(190):1-54, 2024.
    Abstract: This paper studies the binary classification of unbounded data from $\mathbb{R}^d$ generated under Gaussian Mixture Models (GMMs) using deep ReLU neural networks. We obtain for the first time non-asymptotic upper bounds and convergence rates of the excess risk (excess misclassification error) for the classification without restrictions on model parameters. While the majority of existing generalization analyses of classification algorithms rely on a bounded domain, …

  • Fermat Distances: Metric Approximation, Spectral Convergence, and Clustering Algorithms

    Updated: 2024-08-31 23:46:18
    Nicolás García Trillos, Anna Little, Daniel McKenzie, James M. Murphy. 25(176):1-65, 2024.
    Abstract: We analyze the convergence properties of Fermat distances, a family of density-driven metrics defined on Riemannian manifolds with an associated probability measure. Fermat distances may be defined either on discrete samples from the underlying measure, in which case they are random, or in the continuum setting, where they are induced by geodesics under a density-distorted Riemannian metric. We …

  • Nonparametric Regression Using Over-parameterized Shallow ReLU Neural Networks

    Updated: 2024-08-31 23:46:18
    Yunfei Yang, Ding-Xuan Zhou. 25(165):1-35, 2024.
    Abstract: It is shown that over-parameterized neural networks can achieve minimax optimal rates of convergence (up to logarithmic factors) for learning functions from certain smooth function classes, if the weights are suitably constrained or regularized. Specifically, we consider the nonparametric regression problem of estimating an unknown $d$-variate function using shallow ReLU neural networks. It is assumed that the regression function belongs to the Hölder space with smoothness …

  • Scalable High-Dimensional Multivariate Linear Regression for Feature-Distributed Data

    Updated: 2024-08-31 23:46:18
    Shuo-Chieh Huang, Ruey S. Tsay. 25(205):1-59, 2024.
    Abstract: Feature-distributed data, which refers to data partitioned by features and stored across multiple computing nodes, are increasingly common in applications with a large number of features. This paper proposes a two-stage relaxed greedy algorithm (TSRGA) for applying multivariate linear regression to such data. The main advantage of TSRGA is that its communication complexity does not depend on the feature dimension, making it highly scalable to very large …

  • Sparse Graphical Linear Dynamical Systems

    Updated: 2024-08-31 23:46:18
    Emilie Chouzenoux, Victor Elvira. 25(223):1-53, 2024.
    Abstract: Time-series datasets are central in machine learning, with applications in numerous fields of science and engineering, such as biomedicine, Earth observation, and network analysis. Extensive research exists on state-space models (SSMs), which are powerful mathematical tools that allow for probabilistic and interpretable learning on time series. Learning the model parameters in SSMs is arguably one of the most complicated tasks, and the inclusion of prior knowledge is known to both ease the …

  • Interpretable algorithmic fairness in structured and unstructured data

    Updated: 2024-08-31 23:46:18
    Hari Bandi, Dimitris Bertsimas, Thodoris Koukouvinos, Sofie Kupiec. 25(215):1-42, 2024.
    Abstract: Systemic bias with respect to gender and race is prevalent in datasets, making it challenging to train classification models that are accurate and alleviate bias. We propose a unified method for alleviating bias in structured and unstructured data, based on a novel optimization approach for optimally flipping outcome labels and training classification models simultaneously. In the case of structured data, we introduce constraints …

  • FedCBO: Reaching Group Consensus in Clustered Federated Learning through Consensus-based Optimization

    Updated: 2024-08-31 23:46:18
    José A. Carrillo, Nicolás García Trillos, Sixu Li, Yuhua Zhu. 25(214):1-51, 2024.
    Abstract: Federated learning is an important framework in modern machine learning that seeks to integrate the training of learning models from multiple users, each user having their own local data set, in a way that is sensitive to data privacy and to communication loss constraints. In clustered federated learning, one assumes an additional unknown group structure among users, and the goal is to train models …

  • Statistical Inference for Fairness Auditing

    Updated: 2024-08-31 23:46:18
    John J. Cherian, Emmanuel J. Candès. 25(149):1-49, 2024.
    Abstract: Before deploying a black-box model in high-stakes problems, it is important to evaluate the model's performance on sensitive subpopulations. For example, in a recidivism prediction task, we may wish to identify demographic groups for which our prediction model has unacceptably high false positive rates, or certify that no such groups exist. In this paper, we frame this task, often referred to as fairness auditing, in terms of multiple hypothesis testing. We show how the bootstrap can …
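The basic ingredient the abstract builds on, a bootstrap assessment of a subgroup's false positive rate, fits in a short sketch. The data, 10% error rate, and percentile-interval construction below are illustrative, not the paper's procedure:

```python
import random


def false_positive_rate(preds, labels):
    """FPR = FP / (FP + TN), computed over the negative (label 0) examples."""
    negs = [(p, y) for p, y in zip(preds, labels) if y == 0]
    if not negs:  # guard against a resample with no negatives
        return 0.0
    return sum(p for p, _ in negs) / len(negs)


def bootstrap_fpr_interval(preds, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for one group's FPR."""
    rng = random.Random(seed)
    pairs = list(zip(preds, labels))
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]  # resample with replacement
        stats.append(false_positive_rate([p for p, _ in sample],
                                         [y for _, y in sample]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


# Hypothetical group: 80 negatives, 8 of them falsely flagged (FPR = 0.1).
labels = [0] * 80 + [1] * 20
preds = [1] * 8 + [0] * 72 + [1] * 20
fpr_hat = false_positive_rate(preds, labels)
lo, hi = bootstrap_fpr_interval(preds, labels)
```

Auditing many groups at once then requires the multiple-testing machinery the paper develops; a per-group interval like this one does not control the family-wise error.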

  • Differentially Private Topological Data Analysis

    Updated: 2024-08-31 23:46:18
    Taegyu Kang, Sehwan Kim, Jinwon Sohn, Jordan Awan. 25(189):1-42, 2024.
    Abstract: This paper is the first to attempt differentially private (DP) topological data analysis (TDA), producing near-optimal private persistence diagrams. We analyze the sensitivity of persistence diagrams in terms of the bottleneck distance, and we show that the commonly used Čech complex has sensitivity that does not decrease as the sample size $n$ increases. This makes it challenging for the persistence diagrams of Čech complexes to be privatized. As an alternative, we show …

  • Spherical Rotation Dimension Reduction with Geometric Loss Functions

    Updated: 2024-08-31 23:46:18
    Hengrui Luo, Jeremy E. Purvis, Didong Li. 25(175):1-55, 2024.
    Abstract: Modern datasets often exhibit high dimensionality, yet the data reside in low-dimensional manifolds that can reveal underlying geometric structures critical for data analysis. A prime example of such a dataset is a collection of cell cycle measurements, where the inherently cyclical nature of the process can be represented as a circle or sphere. Motivated by the need to analyze these types of datasets, we propose a nonlinear dimension reduction method, …

  • Efficient Convex Algorithms for Universal Kernel Learning

    Updated: 2024-08-31 23:46:18
    Aleksandr Talitckii, Brendon Colbert, Matthew M. Peet. 25(203):1-40, 2024.
    Abstract: The accuracy and complexity of machine learning algorithms based on kernel optimization are determined by the set of kernels over which they are able to optimize. An ideal set of kernels should: admit a linear parameterization (for tractability); be dense in the set of all kernels (for robustness); and be universal (for accuracy). Recently, a framework was proposed for using positive matrices to parameterize a class of positive semi-separable kernels. Although this …

  • Nonparametric Copula Models for Multivariate, Mixed, and Missing Data

    Updated: 2024-08-31 23:46:18
    Joseph Feldman, Daniel R. Kowal. 25(164):1-50, 2024.
    Abstract: Modern data sets commonly feature both substantial missingness and many variables of mixed data types, which present significant challenges for estimation and inference. Complete case analysis, which proceeds using only the observations with fully-observed variables, is often severely biased, while model-based imputation of missing values is limited by the ability of the model to capture complex dependencies among possibly many variables of mixed data types. …

  • Adjusted Wasserstein Distributionally Robust Estimator in Statistical Learning

    Updated: 2024-08-31 23:46:18
    Yiling Xie, Xiaoming Huo. 25(148):1-40, 2024.
    Abstract: We propose an adjusted Wasserstein distributionally robust estimator, based on a nonlinear transformation of the Wasserstein distributionally robust (WDRO) estimator in statistical learning. The classic WDRO estimator is asymptotically biased, while our adjusted WDRO estimator is asymptotically unbiased, resulting in a smaller asymptotic mean squared error. Further, under certain conditions, our proposed adjustment technique provides a general principle to de-bias …

  • Statistical analysis for a penalized EM algorithm in high-dimensional mixture linear regression model

    Updated: 2024-08-31 23:46:18
    Ning Wang, Xin Zhang, Qing Mai. 25(222):1-85, 2024.
    Abstract: The expectation-maximization (EM) algorithm and its variants are widely used in statistics. In high-dimensional mixture linear regression, the model is assumed to be a finite mixture of linear regressions, and the number of predictors is much larger than the sample size. The standard EM algorithm, which attempts to find the maximum likelihood estimator, becomes infeasible for such a model. We devise a group lasso penalized EM algorithm and …

  • Fixed points of nonnegative neural networks

    Updated: 2024-08-31 23:46:18
    Tomasz J. Piotrowski, Renato L. G. Cavalcante, Mateusz Gabor. 25(139):1-40, 2024.
    Abstract: We use fixed point theory to analyze nonnegative neural networks, which we define as neural networks that map nonnegative vectors to nonnegative vectors. We first show that nonnegative neural networks with nonnegative weights and biases can be recognized as monotonic and weakly scalable mappings within the framework of nonlinear Perron-Frobenius theory. This fact enables us to provide conditions for the existence of fixed points of nonnegative neural networks, …
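The object of study, a fixed point of a nonnegative network, is easy to exhibit concretely: with nonnegative weights the map is monotone, and iterating from zero climbs toward a fixed point when one exists. A toy one-layer sketch (the weights and the contractivity condition are illustrative choices, not conditions from the paper):

```python
def nonneg_layer(W, b, x):
    """One ReLU layer with nonnegative weights and biases; on nonnegative
    inputs the ReLU never clips, but we keep it for generality."""
    return [max(0.0, sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def iterate_to_fixed_point(W, b, x0, tol=1e-10, max_iter=10_000):
    """Iterate x <- layer(x) until successive iterates agree within tol."""
    x = x0
    for _ in range(max_iter):
        x_next = nonneg_layer(W, b, x)
        if max(abs(a - c) for a, c in zip(x_next, x)) < tol:
            return x_next
        x = x_next
    return x


W = [[0.3, 0.2], [0.1, 0.4]]  # nonnegative, spectral radius < 1
b = [1.0, 1.0]
x_star = iterate_to_fixed_point(W, b, [0.0, 0.0])  # solves x = Wx + b, i.e. (2, 2)
```

Here the fixed point is the solution of the linear system x = Wx + b; the paper's Perron-Frobenius analysis covers far more general nonnegative architectures.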

  • An Algorithmic Framework for the Optimization of Deep Neural Networks Architectures and Hyperparameters

    Updated: 2024-08-31 23:46:18
    Julie Keisler, El-Ghazali Talbi, Sandra Claudel, Gilles Cabriel. 25(201):1-33, 2024.
    Abstract: In this paper, we propose DRAGON, for DiRected Acyclic Graph OptimizatioN, an algorithmic framework to automatically generate efficient deep neural network architectures and optimize their associated hyperparameters. The framework is based on evolving Directed Acyclic Graphs (DAGs), defining a more flexible search space than the existing ones in the literature. It allows mixtures of different classical …

  • An Analysis of Quantile Temporal-Difference Learning

    Updated: 2024-08-31 23:46:18
    Mark Rowland, Rémi Munos, Mohammad Gheshlaghi Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, Karl Tuyls, Marc G. Bellemare, Will Dabney. 25(163):1-47, 2024.
    Abstract: We analyse quantile temporal-difference learning (QTD), a distributional reinforcement learning algorithm that has proven to be a key component in several successful large-scale applications of reinforcement learning. Despite these empirical successes, a theoretical understanding of QTD has proven elusive until now. Unlike classical TD learning, which can be analysed …

  • An Entropy-Based Model for Hierarchical Learning

    Updated: 2024-08-31 23:46:18
    Amir R. Asadi. 25(187):1-45, 2024.
    Abstract: Machine learning, the predominant approach in the field of artificial intelligence, enables computers to learn from data and experience. In the supervised learning framework, accurate and efficient learning of dependencies between data instances and their corresponding labels requires auxiliary information about the data distribution and the target function. This central concept aligns with the notion of regularization in statistical learning theory. Real-world datasets are often characterized by …

  • Heterogeneity-aware Clustered Distributed Learning for Multi-source Data Analysis

    Updated: 2024-08-31 23:46:18
    Yuanxing Chen, Qingzhao Zhang, Shuangge Ma, Kuangnan Fang. 25(211):1-60, 2024.
    Abstract: In diverse fields ranging from finance to omics, it is increasingly common that data is distributed across multiple individual sources, referred to as "clients" in some studies. Integrating raw data, although powerful, is often not feasible, for example, when there are considerations on privacy protection. Distributed learning techniques have been developed to integrate summary statistics as opposed to raw data. In many existing …

  • Individual-centered Partial Information in Social Networks

    Updated: 2024-08-31 23:46:18
    Xiao Han, Y. X. Rachel Wang, Qing Yang, Xin Tong. 25(230):1-60, 2024.
    Abstract: In statistical network analysis, we often assume either that the full network is available or that multiple subgraphs can be sampled to estimate various global properties of the network. However, in a real social network, people frequently make decisions based on their local view of the network alone. Here, we consider a partial information framework that characterizes the local network centered at a given individual by path length $L$ and gives rise to a partial …
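The "local view of path length $L$" is just the ego network reachable within $L$ hops, which breadth-first search extracts directly. A minimal sketch (the adjacency-dict representation and example graph are illustrative):

```python
from collections import deque


def local_view(adj, center, L):
    """Nodes within path length L of `center` in an undirected graph
    given as an adjacency dict {node: [neighbors]}."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == L:  # do not expand beyond the horizon
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


# Path graph 0-1-2-3-4: the L=2 view from node 0 is {0, 1, 2}.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
view = local_view(adj, center=0, L=2)
```

The statistical question in the paper is what such an $L$-hop view can and cannot reveal about global properties of the full network.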

  • From Small Scales to Large Scales: Distance-to-Measure Density based Geometric Analysis of Complex Data

    Updated: 2024-08-31 23:46:18
    Katharina Proksch, Christoph Alexander Weikamp, Thomas Staudt, Benoit Lelandais, Christophe Zimmer. 25(210):1-53, 2024.
    Abstract: How can we tell complex point clouds with different small-scale characteristics apart, while disregarding global features? Can we find a suitable transformation of such data that allows us to discriminate between differences in this sense with statistical guarantees? In this paper, we consider the analysis and classification of complex point clouds as they are …

  • Statistical Optimality of Divide and Conquer Kernel-based Functional Linear Regression

    Updated: 2024-08-31 23:46:18
    Statistical Optimality of Divide and Conquer Kernel-based Functional Linear Regression. Jiading Liu, Lei Shi. 25(155):1–56, 2024. Abstract: Previous analysis of regularized functional linear regression in a reproducing kernel Hilbert space (RKHS) typically requires the target function to be contained in this kernel space. This paper studies the convergence performance of divide-and-conquer estimators in the scenario where the target function does not necessarily reside in the underlying RKHS. As a decomposition-based scalable approach, the divide-and-conquer estimators of functional linear regression
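The divide-and-conquer principle the abstract refers to is: split the sample into chunks, fit a regularized estimator on each chunk, and average the results. A scalar toy sketch (1-d ridge regression as a stand-in; this is not the paper's functional/kernel estimator):

```python
import random

def ridge_1d(xs, ys, lam):
    """Closed-form 1-d ridge estimate: argmin_b sum (y - b x)^2 + lam b^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def divide_and_conquer(xs, ys, lam, m):
    """Split the sample into m chunks, fit ridge on each, average the fits."""
    k = len(xs) // m
    ests = [ridge_1d(xs[i*k:(i+1)*k], ys[i*k:(i+1)*k], lam) for i in range(m)]
    return sum(ests) / m

random.seed(0)
xs = [random.gauss(0, 1) for _ in range(2000)]
ys = [2.0 * x + random.gauss(0, 0.1) for x in xs]
b_full = ridge_1d(xs, ys, lam=1.0)
b_dc = divide_and_conquer(xs, ys, lam=1.0, m=10)
```

Each chunk estimate is computed independently (and could be computed on a separate machine); averaging recovers nearly the full-sample accuracy.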

  • DoWhy-GCM: An Extension of DoWhy for Causal Inference in Graphical Causal Models

    Updated: 2024-08-31 23:46:18
    We present DoWhy-GCM, an extension of the DoWhy Python library, which leverages graphical causal models. Unlike existing causality libraries, which mainly focus on effect estimation, DoWhy-GCM addresses diverse causal queries, such as identifying the root causes of outliers and distributional changes, attributing causal influences to the data generating process of each node, or diagnosis of causal structures. With DoWhy-GCM, users typically specify cause-effect relations via a causal graph, fit causal mechanisms, and pose causal queries---all with just a few lines of code. The general documentation is available at https://www.pywhy.org/dowhy and the DoWhy-GCM specific code at https://github.com/py-why/dowhy/tree/main/dowhy/gcm.
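The workflow the abstract describes (specify a causal graph, fit causal mechanisms, pose causal queries) can be illustrated with a hand-rolled two-node model. Note this sketch is NOT the DoWhy-GCM API, just the underlying idea for a single edge X → Y with a linear mechanism:

```python
import random

# Hypothetical data from the structural model Y = 3 X + noise.
random.seed(1)
data = [(x, 3.0 * x + random.gauss(0, 0.5))
        for x in (random.gauss(0, 1) for _ in range(5000))]

# "Fit the causal mechanism" of Y given its parent X by least squares.
sxy = sum(x * y for x, y in data)
sxx = sum(x * x for x, _ in data)
slope = sxy / sxx

def interventional_mean_y(x_value):
    """Pose a causal query: E[Y | do(X = x_value)] in the fitted model."""
    return slope * x_value
```

In DoWhy-GCM the graph, the mechanism fitting, and queries such as root-cause attribution are each one or two library calls; the documentation linked in the abstract shows the actual API.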

  • Data-driven Automated Negative Control Estimation (DANCE): Search for, Validation of, and Causal Inference with Negative Controls

    Updated: 2024-08-31 23:46:18
    Data-driven Automated Negative Control Estimation (DANCE): Search for, Validation of, and Causal Inference with Negative Controls. Erich Kummerfeld, Jaewon Lim, Xu Shi. 25(229):1–35, 2024. Abstract: Negative control variables are increasingly used to adjust for unmeasured confounding bias in causal inference using observational data. They are typically identified by subject matter knowledge, and there is currently a severe lack of data-driven methods to find negative controls. In this paper, we present a statistical test for discovering negative controls of a special type---disconnected negative

  • PAMI: An Open-Source Python Library for Pattern Mining

    Updated: 2024-08-31 23:46:18
    PAMI: An Open-Source Python Library for Pattern Mining. Uday Kiran Rage, Veena Pamalla, Masashi Toyoda, Masaru Kitsuregawa. 25(209):1–6, 2024. Abstract: Crucial information that can empower users with competitive information to achieve socio-economic development lies hidden in big data. Pattern mining aims to discover this needed information by finding user interest-based patterns in big data. Unfortunately, existing pattern mining libraries are limited to finding a few types of patterns in transactional and sequence databases. This paper tackles this problem by providing a cross-platform

  • Unsupervised Tree Boosting for Learning Probability Distributions

    Updated: 2024-08-31 23:46:18
    Unsupervised Tree Boosting for Learning Probability Distributions. Naoki Awaya, Li Ma. 25(198):1–52, 2024. Abstract: We propose an unsupervised tree boosting algorithm for inferring the underlying sampling distribution of an i.i.d. sample, based on fitting additive tree ensembles in a manner analogous to supervised tree boosting. Integral to the algorithm is a new notion of addition on probability distributions that leads to a coherent notion of residualization, i.e., subtracting a probability distribution from an observation to remove the distributional structure from the sampling distribution of the
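A loose analogy for "subtracting a probability distribution from an observation" (my simplification, not the paper's actual residualization operator) is the probability integral transform: pushing observations through a fitted CDF strips that distribution's structure out, leaving approximately Uniform(0,1) residuals.

```python
import random
from bisect import bisect_right

def ecdf_transform(sample, fit_sample):
    """Push `sample` through the empirical CDF of `fit_sample`.
    If both come from the same distribution, the outputs are
    approximately Uniform(0,1) -- the distribution has been 'removed'."""
    srt = sorted(fit_sample)
    n = len(srt)
    return [bisect_right(srt, x) / n for x in sample]

random.seed(2)
fit = [random.expovariate(1.0) for _ in range(10000)]
new = [random.expovariate(1.0) for _ in range(10000)]
u = ecdf_transform(new, fit)
mean_u = sum(u) / len(u)   # near 1/2 when the residuals are uniform
```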

  • A flexible empirical Bayes approach to multiple linear regression and connections with penalized regression

    Updated: 2024-08-31 23:46:18
    A flexible empirical Bayes approach to multiple linear regression and connections with penalized regression. Youngseok Kim, Wei Wang, Peter Carbonetto, Matthew Stephens. 25(185):1–59, 2024. Abstract: We introduce a new empirical Bayes approach for large-scale multiple linear regression. Our approach combines two key ideas: (i) the use of flexible "adaptive shrinkage" priors, which approximate the nonparametric family of scale mixtures of normal distributions by a finite mixture of normal distributions, and (ii) the use of variational approximations to efficiently estimate prior hyperparameters and compute
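The appeal of a finite normal-mixture prior is that shrinkage has a closed form: for an observation $y = b + N(0, s^2)$ with prior $b \sim \sum_k \pi_k N(0, \tau_k^2)$, the posterior mean mixes component-wise shrinkage factors $\tau_k^2/(\tau_k^2+s^2)$. A minimal sketch (generic normal-means shrinkage, not the paper's variational regression procedure):

```python
import math

def posterior_mean(y, pis, tau2s, s2):
    """Posterior mean of b given y = b + N(0, s2), with prior
    b ~ sum_k pis[k] * N(0, tau2s[k])."""
    # Marginal likelihood of y under component k is N(0, tau2 + s2);
    # shared constants cancel in the normalized weights.
    weights = [p * math.exp(-y * y / (2 * (t2 + s2))) / math.sqrt(t2 + s2)
               for p, t2 in zip(pis, tau2s)]
    total = sum(weights)
    # Component-wise posterior mean shrinks y by tau2 / (tau2 + s2).
    return sum(w / total * y * t2 / (t2 + s2)
               for w, t2 in zip(weights, tau2s))

# A spike-and-slab style mixture: mass near zero, occasional large effects.
shrunk_small = posterior_mean(0.5, [0.9, 0.1], [0.01, 4.0], 1.0)
shrunk_large = posterior_mean(5.0, [0.9, 0.1], [0.01, 4.0], 1.0)
```

Small observations are shrunk hard toward zero, while large ones are mostly attributed to the slab and barely shrunk, which is the adaptive behavior the abstract describes.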

  • Linear Regression With Unmatched Data: A Deconvolution Perspective

    Updated: 2024-08-31 23:46:18
    Linear Regression With Unmatched Data: A Deconvolution Perspective. Mona Azadkia, Fadoua Balabdaoui. 25(197):1–55, 2024. Abstract: Consider the regression problem where the response $Y \in \mathbb{R}$ and the covariate $X \in \mathbb{R}^d$ for $d \geq 1$ are unmatched. Under this scenario, we do not have access to pairs of observations from the distribution of $(X, Y)$; instead, we have separate data sets $\{Y_i\}_{i=1}^{n_Y}$ and $\{X_j\}_{j=1}^{n_X}$, possibly collected from different sources. We study this problem assuming that the regression function is linear and the noise distribution is known, an assumption that we
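To see why anything is identifiable without matched pairs, consider the scalar case $Y = bX + \varepsilon$ with known noise: marginal variances alone give $\mathrm{Var}(Y) = b^2\,\mathrm{Var}(X) + \mathrm{Var}(\varepsilon)$, pinning down $|b|$. A toy sketch of this identifiability idea (my illustration, not the authors' deconvolution estimator, which handles far more than second moments):

```python
import random
from statistics import pvariance

random.seed(3)
b_true, noise_sd = 2.0, 0.5
xs = [random.gauss(0, 1) for _ in range(20000)]
# Unmatched: the Y-sample is generated from *different* X draws.
ys = [b_true * random.gauss(0, 1) + random.gauss(0, noise_sd)
      for _ in range(20000)]

# Var(Y) = b^2 Var(X) + noise_sd^2 identifies |b| when the noise law is known.
b_abs = ((pvariance(ys) - noise_sd ** 2) / pvariance(xs)) ** 0.5
```

The sign of $b$ is not recoverable from variances, which hints at why the full problem needs deconvolution machinery.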

  • Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning

    Updated: 2024-08-31 23:46:18
    Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning. Ilnura Usmanova, Yarden As, Maryam Kamgarpour, Andreas Krause. 25(171):1–54, 2024. Abstract: Optimizing noisy functions online, when evaluating the objective requires experiments on a deployed system, is a crucial task arising in manufacturing, robotics and various other domains. Often, constraints on safe inputs are unknown ahead of time, and we only obtain noisy information indicating how close we are to violating the constraints. Yet, safety must be guaranteed at all times, not only for the
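The log-barrier idea behind the title: to minimize $f$ subject to $g(x) \le 0$, descend on $f(x) - \eta \log(-g(x))$; the barrier term blows up as $g(x) \to 0$, so iterates never cross into the unsafe region. A deterministic toy sketch (constants and the toy problem are my own; the paper handles noisy, black-box $f$ and $g$):

```python
def log_barrier_descent(f_grad, g, g_grad, x0, eta=0.5, lr=0.01, steps=2000):
    """Minimize f subject to g(x) <= 0 by gradient descent on the
    barrier-augmented objective f(x) - eta * log(-g(x))."""
    x = x0
    for _ in range(steps):
        grad = f_grad(x) + eta * g_grad(x) / (-g(x))
        x_new = x - lr * grad
        if g(x_new) < 0:          # extra guard: never leave the safe set
            x = x_new
    return x

# Toy problem: minimize (x - 3)^2 subject to x <= 1, i.e. g(x) = x - 1.
x_opt = log_barrier_descent(lambda x: 2 * (x - 3),
                            lambda x: x - 1,
                            lambda x: 1.0,
                            x0=0.0)
```

The iterate approaches the constrained optimum at the boundary but stays strictly inside the safe set, standing off by an amount controlled by the barrier weight `eta`.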

  • Flexible Bayesian Product Mixture Models for Vector Autoregressions

    Updated: 2024-08-31 23:46:18
    Flexible Bayesian Product Mixture Models for Vector Autoregressions. Suprateek Kundu, Joshua Lukemire. 25(146):1–52, 2024. Abstract: Bayesian non-parametric methods based on Dirichlet process mixtures have seen tremendous success in various domains and are appealing in their ability to borrow information by clustering samples that share identical parameters. However, such methods can face hurdles in heterogeneous settings where objects are expected to cluster only along a subset of axes or where clusters of samples share only a subset of identical parameters. We overcome such limitations by developing a

  • Risk Measures and Upper Probabilities: Coherence and Stratification

    Updated: 2024-08-31 23:46:18
    Risk Measures and Upper Probabilities: Coherence and Stratification. Christian Fröhlich, Robert C. Williamson. 25(207):1–100, 2024. Abstract: Machine learning typically presupposes classical probability theory, which implies that aggregation is built upon expectation. There are now multiple reasons to motivate looking at richer alternatives to classical probability theory as a mathematical foundation for machine learning. We systematically examine a powerful and rich class of alternative aggregation functionals, known variously as spectral risk measures, Choquet integrals, or Lorentz norms. We
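The most familiar member of this family of aggregation functionals is expected shortfall: instead of averaging all losses (expectation), average only the worst $\alpha$-fraction. A minimal sketch (my example of the family, not drawn from the paper):

```python
def expected_shortfall(losses, alpha):
    """Average of the worst alpha-fraction of losses: a simple spectral
    risk measure, i.e. a weighted aggregation that upweights the tail
    rather than treating all outcomes equally as expectation does."""
    srt = sorted(losses, reverse=True)
    k = max(1, int(round(alpha * len(srt))))
    return sum(srt[:k]) / k

es = expected_shortfall([1.0, 2.0, 3.0, 10.0], alpha=0.25)  # worst 25%
```

General spectral risk measures replace the flat tail-average with an arbitrary non-increasing weighting of the sorted losses.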

  • More Efficient Estimation of Multivariate Additive Models Based on Tensor Decomposition and Penalization

    Updated: 2024-08-31 23:46:18
    More Efficient Estimation of Multivariate Additive Models Based on Tensor Decomposition and Penalization. Xu Liu, Heng Lian, Jian Huang. 25(161):1–27, 2024. Abstract: We consider parsimonious modeling of high-dimensional multivariate additive models using regression splines, with or without sparsity assumptions. The approach is based on treating the coefficients in the spline expansions as a third-order tensor. Note that the data do not have tensor predictors or tensor responses, which distinguishes our study from existing ones. A Tucker decomposition is used to reduce the number of parameters in

  • Memory-Efficient Sequential Pattern Mining with Hybrid Tries

    Updated: 2024-08-31 23:46:18
    This paper develops a memory-efficient approach for Sequential Pattern Mining (SPM), a fundamental topic in knowledge discovery that faces a well-known memory bottleneck for large data sets. Our methodology involves a novel hybrid trie data structure that exploits recurring patterns to compactly store the data set in memory; and a corresponding mining algorithm designed to effectively extract patterns from this compact representation. Numerical results on small to medium-sized real-life test instances show an average improvement of 85% in memory consumption and 49% in computation time compared to the state of the art. For large data sets, our algorithm stands out as the only capable SPM approach within 256GB of system memory, potentially saving 1.7TB in memory consumption.
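The core data structure in sequential pattern mining is a trie: sequences sharing a prefix share a path, so storage and support counting are compact. A minimal sketch of a plain (non-hybrid) trie; the paper's hybrid structure additionally compresses recurring patterns, which this sketch omits:

```python
class TrieNode:
    """Plain prefix-tree node for storing item sequences."""
    def __init__(self):
        self.children = {}
        self.count = 0   # how many stored sequences pass through this node

def insert(root, seq):
    """Add one sequence to the trie, sharing existing prefix paths."""
    node = root
    for item in seq:
        node = node.children.setdefault(item, TrieNode())
        node.count += 1

def support(root, prefix):
    """Number of stored sequences beginning with `prefix`."""
    node = root
    for item in prefix:
        if item not in node.children:
            return 0
        node = node.children[item]
    return node.count

root = TrieNode()
for s in [("a", "b", "c"), ("a", "b", "d"), ("a", "c")]:
    insert(root, s)
```

Here the three sequences occupy only six nodes instead of eight items, since the shared prefixes "a" and "a b" are stored once.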

  • Permuted and Unlinked Monotone Regression in R^d: an approach based on mixture modeling and optimal transport

    Updated: 2024-08-31 23:46:18
    Permuted and Unlinked Monotone Regression in R^d: an approach based on mixture modeling and optimal transport. Martin Slawski, Bodhisattva Sen. 25(183):1–57, 2024. Abstract: Suppose that we have a regression problem with response variable $Y \in \mathbb{R}^d$ and predictor $X \in \mathbb{R}^d$, for $d \ge 1$. In permuted or unlinked regression we have access to separate unordered data on $X$ and $Y$, as opposed to data on $(X, Y)$ pairs as in usual regression. So far in the literature, the case $d = 1$ has received attention; see, e.g., the recent papers by Rigollet and Weed (Information and Inference, 8, 619–717) and

  • Transport-based Counterfactual Models

    Updated: 2024-08-31 23:46:18
    Transport-based Counterfactual Models. Lucas De Lara, Alberto González-Sanz, Nicholas Asher, Laurent Risser, Jean-Michel Loubes. 25(136):1–59, 2024. Abstract: Counterfactual frameworks have grown popular in machine learning, both for explaining algorithmic decisions and for defining individual notions of fairness that are more intuitive than typical group fairness conditions. However, state-of-the-art models to compute counterfactuals are either unrealistic or unfeasible. In particular, while Pearl's causal inference provides appealing rules to calculate counterfactuals, it relies on a model that is

  • On the Computational and Statistical Complexity of Over-parameterized Matrix Sensing

    Updated: 2024-08-31 23:46:18
    On the Computational and Statistical Complexity of Over-parameterized Matrix Sensing. Jiacheng Zhuo, Jeongyeol Kwon, Nhat Ho, Constantine Caramanis. 25(169):1–47, 2024. Abstract: We consider solving the low-rank matrix sensing problem with the Factorized Gradient Descent (FGD) method when the specified rank is larger than the true rank. We refer to this as over-parameterized matrix sensing. If the ground truth signal $\mathbf{X} \in \mathbb{R}^{d \times d}$ is of rank $r$, but we try to recover it using $\mathbf{F}\mathbf{F}^\top$ where $\mathbf{F} \in \mathbb{R}^{d \times k}$ and $k > r$, the existing statistical analysis
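Factorized gradient descent in its simplest form runs plain gradient descent on $\|FF^\top - X\|_F^2$ rather than over the matrix directly. A rank-1, noiseless toy sketch (exact observation of $X = xx^\top$, so this is a stand-in for the sensing setting, not the paper's over-parameterized analysis):

```python
def fgd_grad(f, x):
    """Gradient of ||f f^T - x x^T||_F^2 with respect to the factor f:
    4 (f f^T - x x^T) f = 4 (||f||^2 f - (x.f) x)."""
    ff = sum(v * v for v in f)
    xf = sum(a * b for a, b in zip(x, f))
    return [4 * (ff * fi - xf * xi) for fi, xi in zip(f, x)]

def factored_gd(x, f0, lr=0.01, steps=3000):
    """Recover the rank-1 ground truth x x^T by gradient descent
    on the factored objective."""
    f = list(f0)
    for _ in range(steps):
        g = fgd_grad(f, x)
        f = [fi - lr * gi for fi, gi in zip(f, g)]
    return f

x = [1.0, 2.0, 2.0]                       # ground-truth factor
f = factored_gd(x, f0=[0.5, 0.1, 0.3])    # small random-ish init
err = sum((a * b - c * d) ** 2
          for a, c in zip(f, x) for b, d in zip(f, x))
```

Note `f` may converge to `x` or `-x`; both give the same product $FF^\top$, which is all that matters.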

  • Fat-Shattering Dimension of k-fold Aggregations

    Updated: 2024-08-31 23:46:18
    We provide estimates on the fat-shattering dimension of aggregation rules of real-valued function classes. The latter consists of all ways of choosing k functions, one from each of the k classes, and computing pointwise an "aggregate" function of these, such as the median, mean, and maximum. The bounds are stated in terms of the fat-shattering dimensions of the component classes. For linear and affine function classes, we provide a considerably sharper upper bound and a matching lower bound, achieving, in particular, an optimal dependence on k. Along the way, we improve several known results in addition to pointing out and correcting a number of erroneous claims in the literature.

  • Volterra Neural Networks (VNNs)

    Updated: 2024-08-31 23:46:18
    Volterra Neural Networks (VNNs). Siddharth Roheda, Hamid Krim, Bo Jiang. 25(182):1–29, 2024. Abstract: The importance of inference in Machine Learning (ML) has led to an explosive number of different proposals, particularly in Deep Learning. In an attempt to reduce the complexity of Convolutional Neural Networks, we propose a Volterra filter-inspired network architecture. This architecture introduces controlled non-linearities in the form of interactions between the delayed input samples of data. We propose a cascaded implementation of Volterra filtering so as to significantly reduce the number of
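The "interactions between delayed input samples" are the quadratic term of a Volterra expansion: the output at time $n$ is $\sum_i a_i u[n-i] + \sum_{i,j} b_{ij}\, u[n-i]\, u[n-j]$. A minimal second-order filter sketch (a single 1-d filter, not the paper's cascaded network):

```python
def volterra2(u, a, b):
    """Second-order Volterra filter over signal u: a linear term plus
    pairwise interactions between delayed samples, with memory len(a)."""
    m = len(a)
    out = []
    for n in range(m - 1, len(u)):
        window = u[n - m + 1:n + 1][::-1]   # window[i] = u[n - i]
        lin = sum(a[i] * window[i] for i in range(m))
        quad = sum(b[i][j] * window[i] * window[j]
                   for i in range(m) for j in range(m))
        out.append(lin + quad)
    return out

# Sanity check: a pure first-order tap reproduces the input.
y = volterra2([1.0, 2.0, 3.0, 4.0], a=[1.0, 0.0], b=[[0.0, 0.0], [0.0, 0.0]])
```

Cascading such filters composes the quadratic terms, which is how higher-order non-linearities are obtained without activation functions.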

  • Adaptive Latent Feature Sharing for Piecewise Linear Dimensionality Reduction

    Updated: 2024-08-31 23:46:18
    Adaptive Latent Feature Sharing for Piecewise Linear Dimensionality Reduction. Adam Farooq, Yordan P. Raykov, Petar Raykov, Max A. Little. 25(135):1–42, 2024. Abstract: Linear Gaussian exploratory tools such as principal component analysis (PCA) and factor analysis (FA) are widely used for exploratory analysis, pre-processing, data visualization, and related tasks. Because the linear-Gaussian assumption is restrictive, for very high dimensional problems they have been replaced by robust, sparse extensions or more flexible discrete-continuous latent feature models. Discrete-continuous latent

  • Map of job gains and losses

    Updated: 2024-08-30 08:33:57
    Map of job gains and losses. August 30, 2024. Topic: Maps, jobs, New York Times. To show the counties with more or fewer jobs when comparing 2023 to 2019, Ben Casselman and Ella Koeze for the New York Times use a county map with up and down arrows. Green and up means a gain, whereas orange and down means a loss. We've seen similar maps with arrows, but they're usually angled or swooped. I guess I always assumed arrows going straight up and down would jumble together, but this seems to work. Related: Shift in white population vs. people of color; UK party gains and winners; Voting gains for 2020, compared to 2016 election.

  • ✚ Visualization Tools and Resources, August 2024 Roundup

    Updated: 2024-08-29 18:30:48
    Visualization Tools and Resources, August 2024 Roundup. August 29, 2024. Topic: The Process, roundup. Every month I collect tools and resources to help you make better charts. This is the good stuff for August 2024. To access this issue of The Process, you must be a member. If you are already a member, log in here.

  • Air Quality Stripes

    Updated: 2024-08-27 09:07:11
    Air Quality Stripes. August 27, 2024. Topic: Statistical Visualization, air quality, pollution. In a riff on Climate Stripes, which shows global temperature change as a color-coded barcode chart, Air Quality Stripes uses a similar encoding to show pollution concentration from 1850 through 2021. Related: Global warming color stripes, as decorative conversation starter; Climate change in your lifetime and the next; Climate spiral to show temperature change.

  • Scammed out of life savings, a line chart

    Updated: 2024-08-26 09:45:44
    Scammed out of life savings, a line chart. August 26, 2024. Topic: Statistical Visualization, annotation, Bloomberg, scam. Annette Manes, a retired widow and single mother who saved by spending little, was scammed out of $1.4 million of her life savings. Bloomberg shows the large deposits and withdrawals through Manes' JPMorgan checking account with a step chart. There's also a Sankey diagram to show the splits, and bank-specific timelines that make you wonder why the banks' systems didn't start alerts sooner. The style of annotation and scrolling through time reminds me of the 2015 chart showing one person's weight loss diary. Related: Why Line Chart Baselines Can Start at Non-Zero; How to Untangle a Spaghetti Line Chart with R

  • Data GIF Maker lets you make animated GIFs with data

    Updated: 2024-08-23 17:16:00
    Data GIF Maker lets you make animated GIFs with data. August 23, 2024. Topic: Apps, animation, GIF, Google. The Data GIF Maker is a fun tool from Google that lets you add movement to a handful of straightforward charts. Enter up to five values and select from four chart types. Get a downloadable GIF that you can stick in a presentation. Here's a small example for a stacked bar with two values. The charts are simple and the animations just bring the data into view. I prefer to show charts outright, but if this is your thing, the tool makes it really easy. Related: How to Make Animated Visualization GIFs with ImageMagick; Microsoft's visual data explorer SandDance open sourced; Figure skating animated jumps

  • JavaScript Gantt Chart with Custom Data Grid Header Font — JS Chart Tips

    Updated: 2024-08-20 05:42:39
    Hey everyone! We’re excited to launch a new regular feature on our blog called JS Chart Tips. In this series, we’ll share some recent cases handled by our Support Team for users of our JavaScript charting library, highlighting both frequent questions and those unique solutions that shouldn’t remain hidden. Whether these scenarios directly resonate with […] The post JavaScript Gantt Chart with Custom Data Grid Header Font — JS Chart Tips appeared first on AnyChart News.

  • Unlocking Visual Data Insights — DataViz Weekly

    Updated: 2024-08-16 19:28:39
    Data speaks louder when it’s represented graphically. Unlock the power of visual data insights in our new edition of DataViz Weekly, putting a spotlight on new charts and maps that make trends and patterns clear and engaging. Take a look at the projects that have stood out to us this week: K-pop’s global reach through […] The post Unlocking Visual Data Insights — DataViz Weekly appeared first on AnyChart News.

Current Feed Items | Previous Months' Items

Jul 2024 | Jun 2024 | May 2024 | Apr 2024 | Mar 2024 | Feb 2024